Abstract. Quasi-median graphs are a tool commonly used by evolutionary
biologists to visualise the evolution of molecular sequences. As with any graph,
a quasi-median graph can contain cut vertices, that is, vertices whose removal
disconnects the graph. These vertices induce a decomposition of the graph into
blocks, that is, maximal subgraphs which do not contain any cut vertices. Here
we show that the special structure of quasi-median graphs can be used to compute
their blocks without having to compute the whole graph. In particular, we
present an algorithm that, for a collection of $n$ aligned sequences of length $m$
over an alphabet of $l$ letters, can compute the blocks of the associated
quasi-median graph, together with the information required to correctly connect
these blocks together, in run time $O(l^2 n^2 m^2)$. Our primary motivation for
presenting this algorithm is the fact that the quasi-median graph associated to a
sequence alignment must contain all most parsimonious trees for the alignment,
and therefore precomputing the blocks of the graph has the potential to help
speed up any method for computing such trees.



1. Introduction 

Quasi-median graphs are a tool commonly used by evolutionary biologists to
visualise the evolution of molecular sequences, especially mitochondrial sequences
(Schwarz and Dür [17]; Ayling and Brown PQ; Bandelt et al. [5]; Huson et al. [15,
Chapter 9]). Their application to molecular sequence analysis was introduced for
binary sequences in (Bandelt et al. [5]) and for arbitrary sequences in (Bandelt et
al. [1]). A quasi-median graph can be constructed for an alignment of sequences
over any alphabet [3]; for binary sequences they are also known as median graphs
(Bandelt et al. [5]). An example of a quasi-median graph associated to the
hypothetical alignment of sequences $s_1$-$s_{10}$ is presented in Figure 1.1 (see
Bandelt and Dür [3] for more details on how to construct such graphs).

Here we are interested in computing the cut vertices of a quasi-median graph as
well as an associated decomposition of the graph. Recall that given a connected
graph $G = (V(G), E(G))$, consisting of a set $V = V(G)$ of vertices and a set
$E = E(G)$ of edges, a vertex $v \in V$ is called a cut vertex of $G$ if the graph
obtained by deleting $v$ and all edges in $E$ containing $v$ from $G$ is
disconnected (for the basic concepts in graph theory that we use see, for example,
(Diestel [9])). For example, the cut vertices of the quasi-median graph in
Figure 1.1 are represented by white vertices. As with any graph, the cut vertices
of a quasi-median graph decompose it into blocks, that is, maximal subgraphs which
do not contain any cut vertices themselves. These blocks in turn, together with
the information on how they are linked together, give rise to the block
decomposition of the graph (see Section 5 for the formal definition of this
decomposition that we shall use, which is specific to quasi-median graphs). The
main purpose of this paper is to provide an algorithm for computing the block
decomposition of a quasi-median graph without having to compute the whole graph.

Figure 1.1. An alignment of hypothetical DNA sequences and the
associated quasi-median graph. The sequences correspond to the
black vertices and the columns correspond to the edges indicated
by the labels.

The results in this paper complement the well-developed theory of quasi-median
networks (cf., e.g., (Bandelt et al. [8]; Imrich and Klavžar [IB])). However, our
primary motivation for computing the block decomposition of quasi-median graphs is
provided by their close connection with most parsimonious trees (see, e.g.,
Felsenstein [13] for an overview of parsimony). Indeed, Bandelt and Röhl [7] showed
that the set of all most parsimonious trees for a collection of (aligned, gap-free)
sequences must be contained in the quasi-median graph of the sequences (see also
(Bandelt [2]) for a proof of this result for median networks). More specifically,
they showed that the most parsimonious trees for the sequences are in one-to-one
correspondence with the Steiner trees for the sequences considered as a subset of
the vertices of the quasi-median graph. It easily follows that the block
decomposition of a quasi-median graph can be used to break up the computation of
most parsimonious trees into subcomputations on the blocks. Of course, the
quasi-median graph of an arbitrary collection of sequences may not contain any cut
vertices but, as computing most parsimonious trees is NP-hard (Foulds and Graham
[H]), it could still be a useful pre-processing step to compute the cut vertices of
quasi-median graphs before trying to compute most parsimonious trees.

We now summarise the contents of the rest of this paper. We begin by presenting
some preliminaries concerning quasi-median graphs in the next section. Then, in
Section 3, we recall a characterisation of the vertices of a quasi-median graph
given in (Bandelt et al. [6]), which we use in Section 4 to prove a key structural
result for quasi-median graphs (Theorem 4.1). This result is a direct
generalisation of Theorem 1 of (Dress et al. [6]) for median graphs, and states
that the blocks in a quasi-median graph are in bijection with the connected
components of a certain graph which can be associated to an alignment and which
captures the degree of "incompatibility" between its columns. Using this result,
we also derive a characterisation of the cut vertices of a quasi-median graph
(Theorem 4.6). After defining the block decomposition of a quasi-median graph in
Section 5, we present our algorithm for its computation in Section 6
(Algorithm 1). In particular, we prove that this algorithm correctly computes the
block decomposition (Theorem 6.1) and also show that, for a collection of $n$
aligned sequences of length $m$ over an alphabet with $l$ letters, the algorithm's
run time is $O(l^2 n^2 m^2)$ (Theorem 6.3). We have implemented the algorithm and
it is available for download at

http://www.uea.ac.uk/cmp/research/cmpbio/quasidec.



2. Preliminaries 

In the following we shall define quasi-median networks in terms of partitions
rather than sequences, as explained in [6]. It is quite natural to do this since,
given a multiple sequence alignment as in Figure 1.1, each column of the alignment
gives rise to a partition of the set of sequences in which all those sequences
having the same nucleotide in the column are grouped together (note that columns
with only one nucleotide are usually ignored). In particular, by also recording the
number of columns giving rise to a specific partition, alignments can be recoded in
terms of sets of partitions of the sequences. This whole process is described in
more detail in, for example, (Bandelt and Dür [3]).
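
The recoding of alignment columns as partitions can be sketched in a few lines of
Python. This is only an illustration: the function name `column_partitions`, the
dictionary-based input format and the toy alignment are assumptions made here, not
part of the implementation mentioned in the introduction.

```python
from collections import Counter

def column_partitions(sequences):
    """Map each distinct column partition to the number of columns inducing it.

    `sequences` maps a sequence name to its aligned, gap-free string.
    A partition is stored as a frozenset of frozensets of sequence names.
    """
    names = sorted(sequences)
    length = len(sequences[names[0]])
    counts = Counter()
    for j in range(length):
        parts = {}
        for name in names:
            # group the sequences by the letter they carry in column j
            parts.setdefault(sequences[name][j], set()).add(name)
        if len(parts) > 1:  # columns with a single letter are ignored
            counts[frozenset(frozenset(p) for p in parts.values())] += 1
    return counts

# Hypothetical toy alignment (not the one in Figure 1.1).
alignment = {"s1": "AAC", "s2": "AGC", "s3": "TGC", "s4": "TAC"}
partitions = column_partitions(alignment)
```

In this toy alignment the third column is constant, so only two partitions are
recorded, each with multiplicity one.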

We now recall how quasi-median networks can be defined in terms of partitions.
For the rest of this paper let $X$ denote an arbitrary, non-empty finite set. A
partition $P$ of the set $X$ is a collection of non-empty subsets of $X$ whose
union is $X$ and for which $A \cap B = \emptyset$ for all distinct $A, B \in P$.
For $x \in X$ we set $P(x)$ to be the unique element of $P$ that contains $x$.

Example 2.1. Consider the set $X = \{s_1, s_2, \ldots, s_{10}\}$ of sequences given
in Figure 1.1. The columns labelled $1, \ldots, 12$ give rise to the partitions
$P_1, P_2, \ldots, P_{12}$ of $X$, respectively. For example,

$P_6 = \{\{s_1, s_2, s_3, s_4, s_7, s_8, s_9\}, \{s_5\}, \{s_6, s_{10}\}\}$

and the element of $P_1$ containing $s_6$ is given by

$P_1(s_6) = \{s_3, s_4, s_5, s_6, s_9, s_{10}\}$.

Let $\mathcal{P}$ be an arbitrary set of partitions of $X$, also called a partition
system on $X$. A $\mathcal{P}$-map is a map $v : \mathcal{P} \to 2^X$ that maps
every partition in $\mathcal{P}$ to one of its parts. Note that, given any
$x \in X$, the map $v_x : \mathcal{P} \to 2^X$ given by setting $v_x(P) = P(x)$ for
$P \in \mathcal{P}$ is a $\mathcal{P}$-map. In particular, we obtain a map
$\pi : x \mapsto v_x$ from $X$ to the set of all possible $\mathcal{P}$-maps.

Now, given any three $\mathcal{P}$-maps $v_1, v_2, v_3$, the quasi-median
$q(v_1, v_2, v_3)$ is defined to be the $\mathcal{P}$-map

$P \mapsto \begin{cases} v_2(P), & \text{if } v_2(P) = v_3(P), \\ v_1(P), & \text{otherwise} \end{cases}$

for $P \in \mathcal{P}$. The quasi-median hull $H(\Phi)$ of a set $\Phi$ of
$\mathcal{P}$-maps is the smallest set of $\mathcal{P}$-maps that contains $\Phi$
and is closed under taking quasi-medians, or, more formally,
$H(\Phi) = \bigcup_{i \ge 0} H_i(\Phi)$, where

$H_0(\Phi) = \Phi$ and $H_i(\Phi) = \{q(v_1, v_2, v_3) \mid v_1, v_2, v_3 \in H_{i-1}(\Phi)\}$.

The quasi-median graph $Q(\mathcal{P})$ of a partition system $\mathcal{P}$ on $X$
has vertex set $H(\pi(X))$ and edge set consisting of all those pairs
$\{v_1, v_2\}$ of $\mathcal{P}$-maps in $H(\pi(X))$ that differ on precisely one
partition, that is, $|\{P \in \mathcal{P} \mid v_1(P) \ne v_2(P)\}| = 1$.
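
For very small partition systems the definitions above can be executed directly.
The following Python sketch computes the quasi-median hull by brute-force closure
and then reads off the edges. The function names and the representation of a
$\mathcal{P}$-map as a tuple of parts (one per partition) are choices made for this
illustration, and the approach takes time exponential in $|\mathcal{P}|$ in the
worst case, which is exactly the cost the rest of the paper aims to avoid.

```python
from itertools import product

def quasi_median(v1, v2, v3):
    """q(v1, v2, v3): use v2's part where v2 and v3 agree, otherwise v1's part."""
    return tuple(b if b == c else a for a, b, c in zip(v1, v2, v3))

def quasi_median_graph(X, system):
    """Return (vertices, edges) of Q(system), computed by brute-force closure."""
    def part_of(P, x):
        return next(A for A in P if x in A)

    # pi(X): one map per element x, sending each partition P to P(x)
    hull = {tuple(part_of(P, x) for P in system) for x in X}
    while True:
        new = {quasi_median(a, b, c) for a, b, c in product(hull, repeat=3)} - hull
        if not new:
            break
        hull |= new
    # two maps are adjacent if they differ on exactly one partition
    edges = {frozenset((u, v)) for u in hull for v in hull
             if sum(a != b for a, b in zip(u, v)) == 1}
    return hull, edges

# Hypothetical toy system: three pairwise strongly compatible partitions of {1, 2, 3}.
X = {1, 2, 3}
system = [
    (frozenset({1}), frozenset({2, 3})),
    (frozenset({2}), frozenset({1, 3})),
    (frozenset({3}), frozenset({1, 2})),
]
vertices, edges = quasi_median_graph(X, system)
```

For this toy system the hull contains the three labelled vertices plus their
quasi-median, so the graph is a star with three leaves.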

Example 2.2. The quasi-median graph of the partition system described in
Example 2.1 is depicted in Figure 1.1; the map $\pi$ gives the labelling of the
black vertices in the graph by the sequences $s_1$ to $s_{10}$. For example, the
vertex $e_4$ maps partition $P_3$ to $\{s_1, s_5, s_6, s_{10}\}$ and partition
$P_6$ to $\{s_1, s_2, s_3, s_4, s_7, s_8, s_9\}$.



3. Strong compatibility and quasi-median graphs 

We now consider a concept that is useful for understanding the structure of
quasi-median graphs (cf. (Bandelt et al. [6])). Two partitions $P, Q$ of $X$ are
called strongly compatible if either $P = Q$ or there exist $A \in P$, $B \in Q$
such that $A \cup B = X$ (see [HJ p. 3]). Obviously, if distinct partitions $P, Q$
of $X$ are strongly compatible, then the sets $A$ and $B$ are necessarily unique;
we set $B(P, Q) = A$ and $B(Q, P) = B$. The following observation concerning these
sets will be useful later.

Lemma 3.1. Let $P, Q, R$ be distinct partitions of a set $X$ such that $P$ and $Q$
are not strongly compatible and $P, Q$ are both strongly compatible with $R$. Then
$B(R, P) = B(R, Q)$.

Proof. Since $R$ and $P$ are strongly compatible, we have
$B(R, P) \cup B(P, R) = X$. If $B(R, Q) \ne B(R, P)$, this implies
$B(R, Q) \subseteq B(P, R)$. So we get
$B(Q, R) \cup B(P, R) \supseteq B(Q, R) \cup B(R, Q) = X$; a contradiction to $P$
and $Q$ not being strongly compatible. □

A partition system $\mathcal{P}$ on $X$ is called strongly compatible if all
$P, Q \in \mathcal{P}$ are strongly compatible. The following result, which is
shown in the proof of [10, Lemma 3.1], will be useful later on for obtaining bounds
on the number of cut vertices in a quasi-median graph.

Proposition 3.2. Let $X$ be a set of cardinality $n \ge 2$ and $\mathcal{P}$ be a
strongly compatible set of partitions of $X$. Then $|\mathcal{P}| \le 3n - 5$.

We now consider a graph that will be key for our description of the block
decomposition of a quasi-median graph. The non-strong-compatibility graph for a
partition system $\mathcal{P}$ on $X$ (Bandelt and Dür [3]) is the graph with
vertex set $\mathcal{P}$ and edge set

$\{\{P, Q\} \mid P \text{ and } Q \text{ are not strongly compatible}\}$.

Properties of this graph have also been considered in (Schwarz and Dür [17]).



Example 3.3. We continue Example 2.1. The non-strong-compatibility graph of the
partition system is depicted in Figure 3.1. For example, the partitions $P_1$ and
$P_5$ are strongly compatible with
$B(P_1, P_5) = \{s_3, s_4, s_5, s_6, s_9, s_{10}\}$ and
$B(P_5, P_1) = \{s_1, s_2, s_3, s_4, s_7, s_8, s_9, s_{10}\}$. Similarly, $P_1$ and
$P_6$ are strongly compatible and, as required by Lemma 3.1,
$B(P_1, P_6) = B(P_1, P_5)$. On the other hand, $P_3$ and $P_8$ are not strongly
compatible, as we cannot find elements of the partitions whose union is $X$, which
gives the edge $\{3, 8\}$ in the non-strong-compatibility graph.




Figure 3.1. The non-strong-compatibility graph for the set of
partitions in Example 2.1. A vertex labelled $i$ corresponds to partition
$P_i$, $1 \le i \le 12$.



We now present some useful links between strong compatibility and quasi-median
graphs. The following result was proved in ([6, Theorem 1]).

Theorem 3.4. Let $\mathcal{P}$ be a set of partitions of $X$. Then a
$\mathcal{P}$-map $\varphi$ is a vertex of the quasi-median graph $Q(\mathcal{P})$
if and only if for every pair of distinct, strongly compatible partitions
$P_1, P_2 \in \mathcal{P}$ either $\varphi(P_1) = B(P_1, P_2)$ or
$\varphi(P_2) = B(P_2, P_1)$.

Denote the complete graph on $n$ vertices by $K_n$, and, for two graphs $G, H$, let
$G \,\square\, H$ denote the (Cartesian) product of $G$ and $H$, that is, the graph
with vertex set $V(G) \times V(H)$ and edge set
$\{\{(u, v), (u, w)\} \mid \{v, w\} \in E(H)\} \cup \{\{(u, w), (v, w)\} \mid \{u, v\} \in E(G)\}$.
In the extreme cases of pairwise strong compatibility and pairwise
non-strong-compatibility for a set of partitions, we have the following
descriptions of the quasi-median graph (see [6, Theorem 2, Corollary 1]).

Theorem 3.5. Let $\mathcal{P}$ be a set of partitions of $X$. Then

(i) If every pair $P, Q \in \mathcal{P}$ is strongly compatible, then
$Q(\mathcal{P})$ is a block graph, that is, every block in $Q(\mathcal{P})$ is
isomorphic to a complete graph.

(ii) If no distinct $P, Q \in \mathcal{P}$ are strongly compatible, then
$Q(\mathcal{P})$ is isomorphic to $\square_{P \in \mathcal{P}} K_{|P|}$.



4. Cut vertices and blocks in quasi-median graphs 

We now turn to understanding the cut vertices and blocks of a quasi-median
graph. By definition, for each edge $e = \{v_1, v_2\}$ of the quasi-median graph of
a set of partitions $\mathcal{P}$ of $X$, there exists exactly one
$P \in \mathcal{P}$ such that $v_1(P) \ne v_2(P)$. We say that $P$ is the partition
corresponding to $e$. Given a block $B$ of $Q(\mathcal{P})$ we denote by
$\mathcal{P}(B)$ the set of all $P \in \mathcal{P}$ that correspond to some edge of
$B$. The following result, which relates the connected components of the
non-strong-compatibility graph of $\mathcal{P}$ to the blocks of $Q(\mathcal{P})$,
is the key component of all that follows. Note that it has been proved in the
special case where all partitions in $\mathcal{P}(B)$ have cardinality two in [12].



Theorem 4.1. Let $X$ be a finite set and $\mathcal{P}$ be a partition system on
$X$. Then the blocks of the quasi-median graph of $\mathcal{P}$ are in bijection
with the connected components of the non-strong-compatibility graph of
$\mathcal{P}$. More specifically, a bijection is given by mapping each block $B$ of
the quasi-median graph $Q(\mathcal{P})$ to the (necessarily) connected component of
the non-strong-compatibility graph whose vertex set equals $\mathcal{P}(B)$.

Proof. We prove the theorem by induction on $|\mathcal{P}|$, the base case
$|\mathcal{P}| = 1$ being obvious.

Choose some $P \in \mathcal{P}$ and set $\mathcal{P}' = \mathcal{P} \setminus \{P\}$.
By the induction hypothesis, the blocks of $Q(\mathcal{P}')$ are in bijection with
the connected components of the non-strong-compatibility graph of $\mathcal{P}'$.
First suppose that $P$ is strongly compatible to all $P' \in \mathcal{P}'$.
Obviously, the non-strong-compatibility graph of $\mathcal{P}$ is derived from the
non-strong-compatibility graph of $\mathcal{P}'$ by adding the isolated vertex $P$.
By Theorem 3.4 the vertices of $Q(\mathcal{P})$ are either just vertices of the
subgraph isomorphic to $Q(\mathcal{P}')$ or those $\mathcal{P}$-maps $v$ defined by

$v(Q) = \begin{cases} B(Q, P), & \text{if } Q \in \mathcal{P}', \\ A, & \text{otherwise,} \end{cases}$

for some $A \in P$. There can be only one vertex which is of both types, and this
is the cut vertex separating the two types of vertices and hence separating the new
block, in which all edges correspond to $P$, from the other blocks. The existence
of the bijection now follows from the induction hypothesis.

Now suppose $P$ is not strongly compatible to some $Q \in \mathcal{P}'$. It follows
from Theorem 3.5 (ii) that all edges corresponding to $P$ and $Q$ must be in the
same block. Hence, all blocks of $Q(\mathcal{P}')$ containing partitions not
strongly compatible to $P$ are joined together to a new block also containing $P$.
The same happens for the non-strong-compatibility graph, yielding the result. □
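
Theorem 4.1 means that the sets $\mathcal{P}(B)$ can be read off from the
non-strong-compatibility graph alone. A minimal Python sketch of this idea follows;
the function names, the representation of partitions as tuples of frozensets and
the toy system are assumptions made for illustration only.

```python
def strongly_compatible(P, Q):
    """True if P == Q or some part of P and some part of Q together cover X."""
    if set(P) == set(Q):
        return True
    X = frozenset().union(*P)
    return any(A | B == X for A in P for B in Q)

def block_partition_sets(system):
    """Connected components of the non-strong-compatibility graph, as index sets."""
    unseen, components = set(range(len(system))), []
    while unseen:
        start = unseen.pop()
        component, stack = {start}, [start]
        while stack:
            i = stack.pop()
            # neighbours of i: partitions that are not strongly compatible with it
            linked = {j for j in unseen
                      if not strongly_compatible(system[i], system[j])}
            unseen -= linked
            component |= linked
            stack.extend(linked)
        components.append(component)
    return components

# Hypothetical toy system on {1, 2, 3, 4}.
system = [
    (frozenset({1, 2}), frozenset({3, 4})),
    (frozenset({1, 3}), frozenset({2, 4})),
    (frozenset({1}), frozenset({2, 3, 4})),
]
components = block_partition_sets(system)
```

Here the first two partitions are not strongly compatible and so form one
component, while the third is isolated; by Theorem 4.1 the corresponding
quasi-median graph therefore has two blocks.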



Example 4.2. Considering Example 2.1, we see that the non-strong-compatibility
graph in Figure 3.1 has eight connected components: one whose vertex set consists
of the partitions $P_1, P_2, P_3$ and $P_8$, one containing the partitions $P_5$
and $P_6$, and six isolated vertices corresponding to the remaining partitions.
This is in accordance with the eight blocks of the quasi-median graph in
Figure 1.1, these being the large block in the middle of the graph, corresponding
to $P_1, P_2, P_3$ and $P_8$, the block on the left isomorphic to the Cartesian
product of an edge and a triangle, corresponding to $P_5$ and $P_6$, two triangular
blocks corresponding to the partitions $P_7$ and $P_{12}$, each having three parts,
and four edges corresponding to the partitions $P_4, P_9, P_{10}$ and $P_{11}$,
each having two parts.



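It follows from Theorem 4.1 that the collection of sets $\mathcal{P}(B)$ over all
blocks $B$ of $Q(\mathcal{P})$ defines a partition $\mathrm{Part}(\mathcal{P})$ of
$\mathcal{P}$, and that the following result, which will be useful later, holds.

Corollary 4.3. Let $\mathcal{P}$ be a partition system of $X$ with
$|\mathcal{P}| > 1$, $P \in \mathcal{P}$,
$\mathcal{P}' := \mathcal{P} \setminus \{P\}$ and
$I(\mathcal{P}', P) := \{Q \in \mathcal{P}' \mid Q \text{ not strongly compatible to } P\}$.
Then we have

$\mathrm{Part}(\mathcal{P}) = \{\mathcal{R} \in \mathrm{Part}(\mathcal{P}') \mid I(\mathcal{R}, P) = \emptyset\} \cup \left\{ \bigcup \{\mathcal{R} \in \mathrm{Part}(\mathcal{P}') \mid I(\mathcal{R}, P) \ne \emptyset\} \cup \{P\} \right\}$.

In particular, if $I(\mathcal{P}', P) = \emptyset$ we have
$\mathrm{Part}(\mathcal{P}) = \mathrm{Part}(\mathcal{P}') \cup \{\{P\}\}$.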



Also, by Theorem 4.1 and Proposition 3.2 the following bounds on the number of cut
vertices and blocks in a quasi-median graph must hold; this will be useful for
establishing run time bounds for our main algorithm.

Corollary 4.4. Let $X$ be a set of cardinality $n \ge 2$ and $\mathcal{P}$ be a set
of partitions of $X$. Then $Q(\mathcal{P})$ has at most $3n - 5$ blocks and at most
$3n - 6$ cut vertices.

We conclude this section by presenting a characterisation for the cut vertices in 
a quasi-median graph that is of independent interest, and will not be used later. 
First we prove a useful observation. 

Lemma 4.5. Let $\mathcal{P}$ be a partition system of $X$ and $v$ a cut vertex of
$Q(\mathcal{P})$. Suppose that $P_1, P_2 \in \mathcal{P}$ are distinct and that
$P_i$ corresponds to an edge in the subgraph induced by $Q(\mathcal{P})$ on the set
$V(C_i) \cup \{v\}$, $i = 1, 2$, where $C_1, C_2$ are two distinct connected
components of the graph $Q(\mathcal{P})$ with $v$ removed. Then $P_1, P_2$ are
strongly compatible, and $v(P_1) = B(P_1, P_2)$, $v(P_2) = B(P_2, P_1)$ both hold.

Proof. Since $P_1, P_2$ must be contained in distinct blocks of $Q(\mathcal{P})$,
it immediately follows by Theorem 4.1 that $P_1$ and $P_2$ are strongly compatible.

Now, by Theorem 3.4 we can assume without loss of generality that
$v(P_1) = B(P_1, P_2)$. Let $\{w, w'\}$ be an edge in $Q(\mathcal{P})$ that
corresponds to $P_1$. Without loss of generality, we can assume that there is a
path in $Q(\mathcal{P})$ from $w$ to $v$ such that no edge in this path corresponds
to $P_1$ or $P_2$. In particular, we have $w(P_1) = v(P_1) = B(P_1, P_2)$.
Moreover, $w'(P_1) \ne B(P_1, P_2)$ and so by Theorem 3.4
$w'(P_2) = B(P_2, P_1)$. But $w'(P_2) = v(P_2)$ as, by Theorem 4.1, the block
containing all edges corresponding to $P_2$ must be contained in the subgraph
induced by $Q(\mathcal{P})$ on the set $V(C_2) \cup \{v\}$. This completes the
proof of the lemma. □

We now present the aforementioned characterisation of cut vertices. Note that 
it generalises a characterisation of cut vertices in median graphs given in [12] . 



Theorem 4.6. Let $\mathcal{P}$ be a partition system of $X$ and $v$ be a vertex of
$Q(\mathcal{P})$. Then $v$ is a cut vertex of $Q(\mathcal{P})$ if and only if the
graph $G_v$ with vertex set $\mathcal{P}$ and edge set
$\{\{P, Q\} \mid P, Q \in \mathcal{P},\ P \ne Q \text{ and } v(P) \cup v(Q) \ne X\}$
is disconnected.

Proof. Suppose that $v$ is a cut vertex of $Q(\mathcal{P})$. Then it follows
immediately by Theorem 4.1 and Lemma 4.5 that $G_v$ is disconnected.

Conversely, suppose that $G_v$ is disconnected, and, for contradiction, that $v$ is
not a cut vertex of $Q(\mathcal{P})$. Note that the non-strong-compatibility graph
of $\mathcal{P}$ is a subgraph of $G_v$. Hence the non-strong-compatibility graph
of $\mathcal{P}$ is disconnected. Therefore, by Theorem 4.1 there are at least two
blocks in $Q(\mathcal{P})$.

Now, suppose $B$ is the block of $Q(\mathcal{P})$ containing $v$. By Theorem 4.1
there must exist some block $B' \ne B$ of $Q(\mathcal{P})$ such that
$\mathcal{P}(B')$ is contained in the vertex set of some connected component of
$G_v$ that is not equal to the connected component of $G_v$ whose vertex set
contains $\mathcal{P}(B)$. Let $w$ be the cut vertex of $Q(\mathcal{P})$ contained
in $B$ which lies on a shortest path from $v$ to some vertex in $B'$. Let
$P \in \mathcal{P}$ correspond to the edge on this path incident with $w$ (which
must exist as $v$ is not a cut vertex), and let $P' \in \mathcal{P}(B')$. Then, by
Lemma 4.5, $w(P) = B(P, P')$ and $w(P') = B(P', P)$. Moreover, by Theorem 4.1,
$w(P') = v(P')$ and $w(P) \ne v(P)$. Hence $v(P) \cup v(P') \ne X$, which is a
contradiction as $P$ and $P'$ are in distinct components of $G_v$. □
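
The criterion of Theorem 4.6 is straightforward to test once a vertex is given as a
list of parts. The Python sketch below is illustrative only: the function name and
the list representation of $v$ are assumptions, and the input is assumed to be a
genuine vertex of $Q(\mathcal{P})$ for a non-empty partition system.

```python
def is_cut_vertex(X, v):
    """Apply the test of Theorem 4.6 to a vertex v of Q(P).

    `v` lists one part per partition; G_v joins partitions i != j whenever
    v[i] | v[j] != X, and v is a cut vertex exactly when G_v is disconnected.
    """
    X = frozenset(X)
    n = len(v)
    seen, stack = {0}, [0]
    while stack:  # depth-first search in G_v starting from partition 0
        i = stack.pop()
        for j in range(n):
            if j not in seen and v[i] | v[j] != X:
                seen.add(j)
                stack.append(j)
    return len(seen) < n

# Hypothetical toy data: partitions {1|23}, {2|13}, {3|12} of X = {1, 2, 3},
# whose quasi-median graph is a star with three leaves.
X = {1, 2, 3}
centre = [frozenset({2, 3}), frozenset({1, 3}), frozenset({1, 2})]
leaf = [frozenset({1}), frozenset({1, 3}), frozenset({1, 2})]
centre_is_cut = is_cut_vertex(X, centre)
leaf_is_cut = is_cut_vertex(X, leaf)
```

For the centre of the star every pair of parts covers $X$, so $G_v$ has no edges
and the centre is reported as a cut vertex; for a leaf, $G_v$ is connected.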



5. The block decomposition of a quasi-median graph 

As stated in the introduction, we want to determine the blocks of the quasi-median
graph $Q(\mathcal{P})$ of a partition system $\mathcal{P}$ without having to
compute $Q(\mathcal{P})$ itself. To do this, rather than computing the blocks of
$Q(\mathcal{P})$ directly, we shall compute some sets associated with each block
which we now define.

Given a block $B$ of $Q(\mathcal{P})$, we let $X(B) = V(B) \cap \pi(X)$ denote the
set of vertices in $B$ labelled by elements in $X$, $\mathcal{P}(B)$ the set of
partitions in $\mathcal{P}$ corresponding to edges of $B$ and $S(B)$ the set of cut
vertices of $Q(\mathcal{P})$ that are in $B$ but not in $X(B)$. Note that $X(B)$ or
$S(B)$ can be empty, but that $X(B) \cup S(B)$ is never empty. We will also
consider the set $\mathcal{P}_r(B)$ of partitions of the set $X(B) \cup S(B)$ that
is induced by, for each $P \in \mathcal{P}(B)$, removing all those edges in $B$
that correspond to $P$.

Example 5.1. For the large block $B$ in the middle of the quasi-median graph in
Example 2.1 we have $X(B) = \{s_7, s_8\}$, $S(B) = \{e_1, e_4, e_5\}$,
$\mathcal{P}(B) = \{P_1, P_2, P_3, P_8\}$ and
$\mathcal{P}_r(B) = \{P'_1, P'_2, P'_3, P'_8\}$, where

$P'_1 = \{\{s_7, s_8, e_5\}, \{e_1, e_4\}\}$, $P'_2 = \{\{e_1, e_5, s_8\}, \{s_7, e_4\}\}$,

$P'_3 = \{\{s_7, s_8, e_1\}, \{e_4, e_5\}\}$, $P'_8 = \{\{s_7, e_1, e_4\}, \{s_8, e_5\}\}$.

Now, we define the block decomposition $\mathfrak{B}(\mathcal{P})$ of the
quasi-median graph of a partition system $\mathcal{P}$ on the set $X$ to be the set

$\{(X(B), S(B), \mathcal{P}_r(B)) \mid B \text{ is a block of } Q(\mathcal{P})\}$.

Our main aim is to compute this decomposition without having to compute
$Q(\mathcal{P})$. Note that in view of the following lemma we can always
reconstruct $Q(\mathcal{P})$ from $\mathfrak{B}(\mathcal{P})$.

Lemma 5.2. Given a partition system $\mathcal{P}$ and a block $B$ of
$Q(\mathcal{P})$, the quasi-median graph $Q(\mathcal{P}_r(B))$ is isomorphic to $B$.

Proof. By definition, a $\mathcal{P}$-map $v$ is a vertex of the block $B$ if and
only if $v$ is contained in some edge of $Q(\mathcal{P})$ corresponding to an
element of $\mathcal{P}(B)$. Consider now the $\mathcal{P}_r(B)$-map $v'$ that maps
a partition $P' \in \mathcal{P}_r(B)$ to the part of $P'$ corresponding to the part
$v(P)$ of $P$. This is a vertex of $Q(\mathcal{P}_r(B))$ and it can be easily seen
that the map $v \mapsto v'$ induces the desired isomorphism between $B$ and
$Q(\mathcal{P}_r(B))$. □

Remark 5.3. In [17, Theorem 3], Schwarz and Dür define what they call the Block
Decomposition of a Quasi-Median Network. However, they do not use the notion
of block in the usual graph theoretical way. Instead, they work with a notion that
is suitable for their aim of visualising quasi-median graphs. In particular, their
blocks depend on an arbitrary vertex of the quasi-median graph which can be
chosen in a suitable way to obtain improved visualisations.

In what follows, we shall not directly compute the block decomposition of
$Q(\mathcal{P})$, but instead some closely related data from which the block
decomposition can be easily computed.

To this end, let $S(\mathcal{P})$ denote the union of all $S(B)$ with $B$ a block
of $Q(\mathcal{P})$; we call any element in $S(\mathcal{P})$ an extra vertex. For
$v \in S(\mathcal{P})$ we denote the set of all blocks $B$ in $Q(\mathcal{P})$ with
$v \in S(B)$ by $\mathcal{B}(v)$. An element $x \in X$ is in the direction of $B$
with respect to $v \in S(B)$ if every path from $x$ to $v$ has an edge in $B$. Note
that since all vertices of $Q(\mathcal{P})$ are elements of the quasi-median hull
of $\pi(X)$, there always exists such an element $x(v, B)$, although this element
is not necessarily unique.

Lemma 5.4. Suppose that $\mathcal{P}$ is a partition system on $X$ and $B$ is a
block of $Q(\mathcal{P})$. If we are given the sets $X(B)$, $S(B)$,
$\mathcal{P}(B)$ and, for each $v \in S(B)$, some element $x(v, B)$ in the
direction of $B$ with respect to $v$, then we can obtain the set $\mathcal{P}_r(B)$
from the set $\mathcal{P}(B)$ in time $O(nm)$, where $n = |X|$,
$m = |\mathcal{P}|$.

Proof. For each partition $P \in \mathcal{P}(B)$ we construct a partition $P'$ of
$X(B) \cup S(B)$ as follows. Elements of $X(B)$ are in that part of $P'$ that they
are in $P$. For each $v \in S(B)$ we choose some
$C \in \mathcal{B}(v) \setminus \{B\}$ and put $v$ in that part of $P'$ that
$x(v, C)$ is in $P$. Repeating this for all partitions $P \in \mathcal{P}(B)$ gives
us the set $\mathcal{P}_r(B)$. This procedure can be carried out in time $O(mn)$,
giving the desired run time bound. □

Example 5.5. To compute $\mathcal{P}_r(B)$ from $\mathcal{P}(B)$ and the elements
$x(v, C)$ for all $v \in S(B)$ for the block $B$ in Example 5.1, assume that
$x(e_1, B_7) = s_3$, $x(e_4, B_4) = s_5$ and $x(e_5, B_{10}) = s_1$, where, for
this moment, we denote by $B_i$ the block containing the (sole) partition $P_i$.

Now, we start out with partition $P_1$ and have to check in which part of the
partition the extra points $e_1$, $e_4$ and $e_5$ are contained. Since
$x(e_1, B_7) = s_3$, we substitute $s_3$ for $e_1$ in $P_1$ and, similarly, we
substitute $s_5$ for $e_4$ and $s_1$ for $e_5$. Deleting all
$x \in X \setminus X(B)$ in the remaining partition yields the partition $P'_1$.
After performing the same process for $P_2$, $P_3$ and $P_8$, we obtain the set
$\mathcal{P}_r(B)$.
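
The construction in Lemma 5.4 and Example 5.5 can be sketched as follows. The
function name `induced_partition` and its argument layout are illustrative
assumptions, and the two parts given for $P_1$ are likewise an assumption that is
consistent with the values of $P_1(s_6)$ and $P'_1$ stated in the examples.

```python
def induced_partition(P, labelled, representative):
    """Build the partition P' of X(B) ∪ S(B) induced by a partition P in P(B).

    `labelled` is X(B); `representative` maps each extra vertex v in S(B) to an
    element x(v, C) of X lying in the direction of a block C != B at v.
    """
    induced = []
    for part in P:
        # labelled vertices of B stay in the part they already belong to
        members = {x for x in labelled if x in part}
        # each extra vertex joins the part that contains its representative
        members |= {v for v, x in representative.items() if x in part}
        if members:
            induced.append(frozenset(members))
    return induced

# Assumed form of P_1 (two parts), with the data of Examples 5.1 and 5.5.
P1 = [frozenset({"s1", "s2", "s7", "s8"}),
      frozenset({"s3", "s4", "s5", "s6", "s9", "s10"})]
P1_induced = induced_partition(
    P1, {"s7", "s8"}, {"e1": "s3", "e4": "s5", "e5": "s1"})
```

Under these assumptions the result consists of the two parts
$\{s_7, s_8, e_5\}$ and $\{e_1, e_4\}$, in agreement with $P'_1$ in Example 5.1.
Each partition is processed with a single pass over $X(B) \cup S(B)$ per part.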

So, to compute the block decomposition of the quasi-median graph of a partition
system $\mathcal{P}$ it suffices to compute, for each block $B$ of
$Q(\mathcal{P})$, the sets $X(B)$, $S(B)$ and $\mathcal{P}(B)$, and also, for each
$v \in S(B)$ and $B \in \mathcal{B}(v)$, some element $x(v, B)$ in the direction
of $B$ with respect to $v$. In the next section we shall present an algorithm for
doing precisely this.



6. Computing the block decomposition of a quasi-median graph 

We now present our approach to computing the block decomposition of a partition
system $\mathcal{P}$, following the strategy presented at the end of the last
section. We start with the block decomposition of an empty set of partitions on $X$
(which is itself empty) and iteratively add each $P \in \mathcal{P}$ to build up
the decomposition. In particular, at each stage, for each block $B$ (either
existing or new) we compute the sets $X(B)$, $S(B)$, $\mathcal{P}(B)$, together
with elements $x(v, B)$, $v \in S(B)$, $B \in \mathcal{B}(v)$. To do this we use
Algorithm 1, the main elements of which are as follows.

First, for each given block $B$, we check whether or not there exists some
partition in $\mathcal{P}(B)$ that is not strongly compatible to the newly added
partition $P$ and thereby also compute which elements of $X$ must be added to our
new block. This is done in the function is_compatible described in Algorithm 2.
This function returns TRUE if the new partition $P$ is strongly compatible to all
partitions $Q$ in the block $B$. All blocks $B$ with is_compatible(P, B) = TRUE
remain blocks for the new block decomposition, and all other blocks are joined
(together with $P$) to form a new block that is added to the decomposition. This is
done in the function add_blocks outlined in Algorithm 3.

We now prove that this approach really works:

Algorithm 1: Algorithm to add a partition.

Input: The set $\mathfrak{B} = \{(X(B), S(B), \mathcal{P}(B)) : B \text{ a block of } Q(\mathcal{P})\}$
for a partition system $\mathcal{P}$ and, for each $v \in S(B)$ and
$B \in \mathcal{B}(v)$, some element $x(v, B)$ in the direction of $B$ with respect
to $v$, together with some partition $P \notin \mathcal{P}$.

Output: The same data for $\mathcal{P} \cup \{P\}$.

 1 Create a new block $C$ with $X(C) = X$, $S(C) = \emptyset$, $\mathcal{P}(C) = \{P\}$;
 2 Create a new extra vertex $v$;
 3 $\mathfrak{B}_{incomp} \leftarrow \emptyset$;
 4 foreach $B \in \mathfrak{B}$ do
 5     if !is_compatible(P, B) then
 6         Add $B$ to $\mathfrak{B}_{incomp}$
 7     end
 8     else
 9         Choose some $Q \in \mathcal{P}(B)$;
10         $X(B) \leftarrow X(B) \cap B(P, Q)$;
11         if $X(B) \not\subseteq B(P, Q)$ and $X(C) \cap X(B) = \emptyset$ then
12             Choose some $x \in B(P, Q)$;
13             if there exists some $v \in S(C)$ such that $x(v, B)$ and $x$ are in the same part of $P$ then
14                 $w \leftarrow v$;
15             end
16             else
17                 $w \leftarrow$ new extra vertex;
18             end
19             $x(w, B) \leftarrow$ add_extra_vertex(w, B);
20             $x(w, C) \leftarrow$ add_extra_vertex(w, C);
21         end
22     end
23 end
24 if $\mathfrak{B}_{incomp} \ne \emptyset$ then
25     add_blocks(C, $\mathfrak{B}_{incomp}$);
26 end
27 return $\mathfrak{B} \cup \{(X(C), S(C), \mathcal{P}(C))\}$ and the elements $x(w, C)$;



Theorem 6.1. Algorithm 1 is correct.

Proof. We first show that if the sets $\mathcal{P}(B)$ and $X(B)$ have been
correctly computed for all blocks $B$ of the quasi-median graph of the partition
system $\mathcal{P} \setminus \{P\}$, then they are correctly computed for
$\mathcal{P}$.

To see that $\mathcal{P}(B)$ is correct, note that the set $\mathcal{P}(C)$ for the
new block $C$ is initialised as $\{P\}$ and in the function add_blocks all
partitions of blocks containing partitions incompatible to $P$ are added and the
corresponding blocks deleted. Hence, it follows from Corollary 4.3 that
$\mathcal{P}(B)$ is correct for all blocks of $Q(\mathcal{P})$.

We now turn to the correctness of $X(B)$. Consider first a block $B$ for which
every partition $Q \in \mathcal{P}(B)$ is strongly compatible with $P$. The
elements of $X(B)$ stay in $X(B)$ if they are in $B(P, Q)$, and similarly move to
$X(C)$ for the new block $C$ if they are in $B(Q, P)$. But, by Theorem 3.5 (i), the
quasi-median graph $Q(\{P, Q\})$ has two blocks $B, C$ with
$\mathcal{P}(B) = \{P\}$, $X(B) = B(Q, P)$ and $\mathcal{P}(C) = \{Q\}$,
$X(C) = B(P, Q)$. It follows that $X(B)$ is correct. Otherwise, if some
$Q \in \mathcal{P}(B)$ is not compatible to $P$, then the corresponding block is
deleted and all elements are simply joined to those in $X(C)$, as required. So,
using a similar argument for $Q(\{P, Q\})$, it follows that $X(C)$ is also correct.

It remains to show that the blocks are added in a proper way, that is, all of
the extra vertices are contained in the blocks that they really belong to. This is
taken care of by the condition in Line 11 of Algorithm 1. There is no need to
add extra vertices for adding two blocks if they already share an element of $X$,
and having $X(B) \not\subseteq B(P, Q)$ ensures that blocks are only added if
needed. Moreover, Algorithm 4 ensures that elements in the direction of some block
are computed. Indeed, suppose all existing $x(\cdot, \cdot)$ are correct. To see
that Algorithm 4 returns an element of $X$ that is in the direction of $B$, first
note that if $X(B) \ne \emptyset$ and $x \in X(B)$, then $x$ is clearly in the
direction of $B$ with respect to $v$. Furthermore, every
$w \in S(B) \setminus \{v\}$ is in the direction of $B$ with respect to $v$ and so
every element in the direction of any $C \in \mathcal{B}(w) \setminus \{B\}$ with
respect to $w$ is in the direction of $B$ with respect to $v$. This completes the
proof of the theorem. □



Algorithm 2: Check if a partition is compatible with all partitions arising
from a block.

 1 is_compatible(P, B)
 2 foreach partition $Q \in \mathcal{P}(B)$ do
 3     if $P$ and $Q$ are strongly compatible then
 4         $X(C) \leftarrow X(C) \cap B(Q, P)$;
 5     end
 6     else
 7         return FALSE;
 8     end
 9 end
10 return TRUE;



We conclude with an analysis of the run time of Algorithm 1. First, we compute
the time needed to check whether two partitions are compatible.

Algorithm 3: Add all blocks incompatible to P.

 1 add_blocks(C, $\mathfrak{B}_{incomp}$)
 2 $X(C) \leftarrow \emptyset$;
 3 foreach $w \in S(C)$ do
 4     if $\mathcal{B}(w) \cup \mathfrak{B}_{incomp} = \emptyset$ then
 5         Delete the extra vertex $w$ from $S(C)$;
 6     end
 7 end
 8 foreach $B \in \mathfrak{B}_{incomp}$ do
 9     Remove $B$ from $\mathfrak{B}$;
10     $X(C) \leftarrow X(C) \cup X(B)$;
11     $\mathcal{P}(C) \leftarrow \mathcal{P}(C) \cup \mathcal{P}(B)$;
12     foreach $w \in S(B)$ do
13         Add $w$ to $S(C)$;
14         $x(w, C) \leftarrow x(w, B)$;
15     end
16 end



Algorithm 4: Add an extra vertex to a block.

 1 add_extra_vertex(v, B)
 2 Add $v$ to $S(B)$;
 3 if $X(B) \ne \emptyset$ then
 4     Choose some $x \in X(B)$;
 5     return $x$;
 6 end
 7 Choose some $w \in S(B) \setminus \{v\}$;
 8 Choose some $C \in \mathcal{B}(w) \setminus \{B\}$;
 9 return $x(w, C)$;



Lemma 6.2. Let $P$ and $Q$ be partitions of $X$ with $|P| = k_1$, $|Q| = k_2$ and
$|X| = n$. Then checking strong compatibility and computing $B(P, Q)$ and
$B(Q, P)$ in case they are compatible can be done in time $O(k_1 k_2 n)$.

Proof. For each $A \in P$ and $B \in Q$ we check if $A \cup B = X$. If this is the
case, $P$ and $Q$ are compatible and $B(P, Q) = A$, $B(Q, P) = B$. Such a check can
be done in linear time in $n$ and there are $k_1 \cdot k_2$ such pairs. □
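
The check in the proof of Lemma 6.2 might be written as follows in Python. The
function name is hypothetical, the two partitions are assumed to be distinct, and
the parts given for $P_1$ and $P_5$ are assumptions chosen to agree with the sets
$B(P_1, P_5)$ and $B(P_5, P_1)$ stated in Example 3.3.

```python
def strong_compatibility_witness(P, Q, X):
    """Return (B(P, Q), B(Q, P)) for distinct strongly compatible P, Q, else None."""
    X = frozenset(X)
    for A in P:              # k1 parts of P
        for B in Q:          # k2 parts of Q
            if A | B == X:   # union and comparison take time linear in n
                return A, B
    return None

# Assumed forms of P_1 and P_5 from the running example.
X = {f"s{i}" for i in range(1, 11)}
P1 = [frozenset({"s1", "s2", "s7", "s8"}),
      frozenset({"s3", "s4", "s5", "s6", "s9", "s10"})]
P5 = [frozenset({"s5", "s6"}),
      frozenset({"s1", "s2", "s3", "s4", "s7", "s8", "s9", "s10"})]
witness = strong_compatibility_witness(P1, P5, X)
```

With these inputs the function returns the six-element part of $P_1$ and the
eight-element part of $P_5$; a return value of `None` signals that the two
partitions are not strongly compatible.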

Theorem 6.3. The algorithm computes the block decomposition of a partition
system $\mathcal{P}$ on $X$ in time $O(k^2 n^2 m^2)$, where $n = |X|$,
$m = |\mathcal{P}|$ and $k = \max\{|P| \mid P \in \mathcal{P}\}$.

Proof. We claim that Algorithm 1 runs in time $O(k^2 n^2 m)$. Since this algorithm
is executed once for each partition, the theorem then follows by Lemma 5.4 and
Corollary 4.4.

It follows from Lemma 6.2 that the function is_compatible in Algorithm 2 runs in
time $O(k^2 n \cdot |\mathcal{P}(B)|)$. The rest of the first loop in Algorithm 1
is dominated by the condition in Line 13. However, since the number of extra
vertices of $Q(\mathcal{P})$ is linear in $n$ by Corollary 4.4, this test can be
performed in $O(n^2)$. Since each partition can only be in one block, this shows
that the loop in Algorithm 1 needs $O((k^2 n + n^2) m)$ time. For the function
add_blocks the run time of the first loop is bounded by $O(n^2)$, taking into
account that by Corollary 4.4 the number of extra vertices and the number of blocks
are linear in $n$. The same holds for the second loop, so add_blocks runs in time
$O(n^2)$. Altogether, we get that Algorithm 1 runs in time
$O((k^2 n + n^2) m + n^2 + k n^2) = O(k^2 n^2 m)$, as claimed. □



Note that, translated into the language of sequences used in the introduction,
this result implies that the block decomposition of the quasi-median graph of $n$
aligned sequences of length $m$ over an alphabet with $l$ characters can be
computed in time $O(l^2 n^2 m^2)$, as the size of each corresponding partition, and
hence $k$, is bounded by $l$.